Science 



Coronavirus Main Proteinase (3CLpro) Structure: Basis for Design of
Anti-SARS Drugs

Kanchan Anand et al.

Science 300, 1763 (2003);

DOI: 10.1126/science.1085658




The following resources related to this article are available online at 
www.sciencemag.org (this information is current as of May 29, 2014 ): 

Updated information and services, including high-resolution figures, can be found in the online 
version of this article at: 

http://www.sciencemag.org/content/300/5626/1763.full.html 

Supporting Online Material can be found at: 

http://www.sciencemag.org/content/suppl/2003/06/12/1085658.DC1.html

This article cites 15 articles, 10 of which can be accessed free: 
http://www.sciencemag.org/content/300/5626/1763.full.html#ref-list-1

This article has been cited by 323 article(s) on the ISI Web of Science 

This article has been cited by 73 articles hosted by HighWire Press; see: 
http://www.sciencemag.org/content/300/5626/1763.full.html#related-urls

This article appears in the following subject collections: 

Biochemistry 

http://www.sciencemag.org/cgi/collection/biochem 


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the 
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright 
2003 by the American Association for the Advancement of Science; all rights reserved. The title Science is a 
registered trademark of AAAS. 



Reports 




Human coronaviruses (HCoVs) are major causes of upper respiratory tract
illness in humans, in particular the common cold (1). To date, only the
229E strain of HCoV has been characterized in detail, because it used to
be the only isolate that grows efficiently in cell culture. It has
recently been shown that a novel HCoV causes severe acute respiratory
syndrome (SARS), a disease that is rapidly spreading from its likely
origin in southern China to several countries in other parts of the
world (2, 3). SARS is characterized by high fever, malaise, rigor,
headache, and nonproductive cough or dyspnea and may progress to
generalized interstitial infiltrates in the lung, requiring intubation
and mechanical ventilation (4). The fatality rate among people with
illness meeting the current definition of SARS is presently around 15%
[calculated as deaths/(deaths + surviving patients)]. Epidemiological
evidence suggests that the transmission of this newly emerging pathogen
occurs mainly by face-to-face contact, although other routes of
transmission cannot be fully excluded. By 9 May 2003, more than 7000
cases of SARS had been diagnosed worldwide, with the numbers still
rapidly increasing. At present, no efficacious therapy is available.
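
The bracketed formula for the fatality rate can be written out as a few lines of Python. This is only a sketch of the arithmetic: the function name and the counts passed to it are hypothetical, chosen so that the result lands on the ~15% quoted above, and are not case data from the report.

```python
def case_fatality_rate(deaths: int, survivors: int) -> float:
    """Return deaths / (deaths + surviving patients) as a fraction."""
    resolved = deaths + survivors
    if resolved == 0:
        raise ValueError("no resolved cases to compute a rate from")
    return deaths / resolved


# Hypothetical counts, used only to illustrate the calculation.
rate = case_fatality_rate(deaths=15, survivors=85)
print(f"fatality rate: {rate:.1%}")
```

Because the denominator counts only resolved cases (deaths plus survivors), patients who are still ill do not enter the estimate.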

Coronaviruses are positive-stranded RNA viruses featuring the largest
viral RNA genomes known to date (27 to 31 kb). The gene for the human
coronavirus 229E replicase, encompassing more than 20,000 nucleotides,
encodes two overlapping polyproteins [pp1a (replicase 1a, ~450 kD) and
pp1ab (replicase 1ab, ~750 kD) (5)] that mediate all the functions
required for viral replication and transcription (6). Expression of the
C-proximal portion of pp1ab requires (-1) ribosomal frameshifting (5).
The functional polypeptides are released from the polyproteins by
extensive proteolytic processing. This is primarily achieved by the
33.1-kD HCoV 229E main proteinase (Mpro) (7), which is frequently also
called 3C-like proteinase (3CLpro) to indicate a similarity of its
cleavage-site specificity to that observed for picornavirus 3C
proteinases [3Cpro (table S1)], although we have recently shown that the
structural similarities between the two families of proteinases are
limited (8). The Mpro (3CLpro) cleaves the polyprotein at no fewer than
11 conserved sites involving Leu-Gln↓(Ser,Ala,Gly) sequences (the
cleavage site is indicated by ↓), a process initiated by the enzyme's
own autolytic cleavage from pp1a and pp1ab (9, 10). This cleavage
pattern appears to be conserved in the Mpro from SARS coronavirus
(SARS-CoV), as we deduced from the genomic sequence published recently
(11, 12) and prove experimentally here for one cleavage site (see
below). The SARS-CoV polyproteins have three noncanonical Mpro cleavage
sites with Phe, Met, or Val in the P2 position, but the same cleavage
sites are unusual in other coronaviruses as well. The functional
importance of Mpro in the viral life cycle makes this proteinase an
attractive target for the development of drugs directed against SARS and
other coronavirus infections.
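
The canonical Leu-Gln↓(Ser,Ala,Gly) pattern can be located in a one-letter sequence with a short regular expression. The sketch below is illustrative only: the function names are hypothetical, the test sequence is the TGEV autoprocessing-site peptide quoted later in this report, and the pattern deliberately ignores the noncanonical sites (Phe, Met, or Val in P2) mentioned above.

```python
import re
from typing import List, Tuple

# Leu-Gln followed by Ser, Ala, or Gly. The lookahead keeps the P1'
# residue out of the match, so m.end() is the index just after Gln,
# i.e. the position of the scissile bond.
CANONICAL_SITE = re.compile(r"LQ(?=[SAG])")


def find_cleavage_sites(sequence: str) -> List[int]:
    """Return indices at which the canonical pattern predicts cleavage."""
    return [m.end() for m in CANONICAL_SITE.finditer(sequence)]


def cleave_at(sequence: str, site: int) -> Tuple[str, str]:
    """Split a sequence into the fragments on either side of one site."""
    return sequence[:site], sequence[site:]


peptide = "VSVNSTLQSGLRKMA"  # TGEV N-terminal autoprocessing-site peptide
for site in find_cleavage_sites(peptide):
    print(cleave_at(peptide, site))
```

A pattern match marks only a candidate site; whether a given site is actually processed is an experimental question that a sequence scan cannot answer.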

Here we report three three-dimensional (3D) structures of coronavirus
Mpros, which together form a solid basis for inhibitor design: (i) the
crystal structure, at 2.54 Å resolution, of the free enzyme of human
coronavirus (strain 229E) Mpro; (ii) a homology model of SARS-CoV Mpro,
based on the crystal structure of HCoV 229E Mpro described here and on
that of the homologous enzyme of the related porcine transmissible
gastroenteritis (corona)virus (TGEV), which we determined previously
(8); and (iii) the 2.37 Å crystal structure of a complex between TGEV
Mpro and a substrate-analog hexapeptidyl chloromethyl ketone (CMK)
inhibitor. Comparison of the structures shows that the substrate-binding
sites are well conserved among coronavirus main proteinases. This is
supported by our experimental finding that recombinant SARS-CoV Mpro
cleaves a peptide corresponding to the N-terminal autocleavage site of
TGEV Mpro. Further, we find the binding mode of the hexapeptidyl
inhibitor to be similar to that seen in the distantly related human
rhinovirus 3C proteinase (3Cpro) (13). On the basis of the combined
structural information, a prototype inhibitor is proposed that should
block Mpros and thus be a suitable drug for targeting coronavirus
infections, including SARS.

The 2.54 Å crystal structure of HCoV 229E Mpro (14) shows that the
molecule comprises three domains (Fig. 1A). Domains I and II (residues 8
to 99 and 100 to 183, respectively) are six-stranded antiparallel β
barrels and together resemble the architecture of chymotrypsin and of
picornavirus 3C proteinases. The substrate-binding site is located in a
cleft between these two domains. A long loop (residues 184 to 199)
connects domain II to the C-terminal domain (domain III, residues 200 to
300). This latter domain, a globular cluster of five helices, has been
implicated in the proteolytic activity of Mpro (15). The HCoV 229E Mpro
structure is very similar to that of TGEV Mpro (8). The root mean square
(rms) deviation between the two structures is ~1.5 Å for all 300 Cα
positions of the molecule (16), but the isolated domains exhibit rms
deviations of only ~0.8 Å. HCoV 229E and TGEV are both group I
coronaviruses (17), and their main proteinases share 61% sequence
identity.
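
For readers unfamiliar with the rms deviation quoted above, the following sketch shows the quantity being computed. It assumes the two sets of Cα coordinates have already been optimally superimposed and paired residue by residue; the superposition step itself is not shown, and the coordinates are made-up values rather than data from either structure.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rms_deviation(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square distance between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in Å: every atom is displaced by 1 Å along z.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rms_deviation(structure_a, structure_b))
```

A lower value for isolated domains than for the whole molecule, as reported here, is what one would expect if the domains are individually similar but sit in slightly different relative orientations.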

For comparison of its enzymatic properties with those of the HCoV and
TGEV Mpros, we expressed SARS-CoV (strain TOR2) Mpro in Escherichia coli
(18) and preliminarily characterized the proteinase. The amino acid
sequence of SARS-CoV Mpro displays 40 and 44% sequence identity to HCoV
229E Mpro and TGEV Mpro, respectively (see Fig. 1B for a structure-based
alignment). Identity levels are 50 and 49%, respectively, between
SARS-CoV Mpro and the corresponding proteinases from the group II
coronaviruses: mouse hepatitis virus (MHV) and bovine coronavirus
(BCoV). Finally, SARS-CoV Mpro shares 39% sequence identity with avian
infectious bronchitis virus (IBV) Mpro, the only group III coronavirus
for which a main proteinase sequence is available. These data are in
agreement with the conclusion deducible from the sequence of the whole
SARS-CoV genome (11, 12) that the new virus is most similar to group II
coronaviruses, although some common features with IBV (group III) can
also be detected. Others have defined SARS-CoV as the first member of a
new group IV (11).
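
Percent identity figures such as these are counts of matching residues over aligned positions. The sketch below shows one common convention (identical residues divided by aligned, gap-free columns), which is an assumption here because the report does not state its denominator. The six-residue strings are the P6 to P1 autoprocessing-site segments that the report quotes for the three enzymes, so the resulting percentages illustrate the calculation and say nothing about the full-length proteinases.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# P6 to P1 of the N-terminal autoprocessing sites, one-letter code.
tgev = "VNSTLQ"   # Val-Asn-Ser-Thr-Leu-Gln
sars = "TSAVLQ"   # Thr-Ser-Ala-Val-Leu-Gln
hcov = "YGSTLQ"   # Tyr-Gly-Ser-Thr-Leu-Gln

print(f"TGEV vs SARS-CoV: {percent_identity(tgev, sars):.1f}%")
print(f"TGEV vs HCoV 229E: {percent_identity(tgev, hcov):.1f}%")
```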

The level of similarity among SARS-CoV, HCoV 229E, and TGEV Mpros
allowed us to construct a reliable 3D model of SARS-CoV Mpro (Fig. 1C).
There are three one- or two-residue insertions in SARS-CoV Mpro relative
to the structural templates; as is to be expected, these are all located
in loops and do not present a problem in model building. Interestingly,
domains I and II show a higher degree of sequence conservation (42 to
48% identity) than does domain III (36 to 40%) between SARS-CoV Mpro and
the coronavirus group I enzymes.




[Fig. 1B: sequence alignment of HCoV, TGEV, MHV, BCoV, SARS-CoV, and IBV Mpro]


Fig. 1. 3D structure of coronavirus Mpro. (A) Monomer of HCoV Mpro. Domains I (top), II, and III (bottom) are
indicated. Helices are red and strands are green. α helices are labeled A to F according to occurrence along the primary
structure, with the additional one-turn A' α helix in the N-terminal segment (residues 11 to 14). β strands are labeled
a to f, followed by an indication of the domain to which they belong (I or II). The N and C termini are labeled N and
C, respectively. Residues of the catalytic dyad, Cys144 and His41, are indicated. (B) Structure-based sequence
alignment of the main proteinases of coronaviruses from all three groups. HCoV, human coronavirus 229E (group
I); TGEV, porcine transmissible gastroenteritis virus (group I); MHV, mouse hepatitis virus (group II); BCoV, bovine
coronavirus (group II); SARS-CoV, SARS coronavirus (between groups II and III); IBV, avian infectious bronchitis virus
(group III). The autocleavage sites of the proteinases are marked by vertical arrows above the sequences. In addition
to the sequences of the mature enzymes, four residues each of the viral polyprotein N-terminal to the first and
C-terminal to the second autocleavage site are shown. Note the conservation of the cleavage pattern,
(small)-Xaa-Leu-Gln↓(Ala,Ser,Gly). Thick bars above the sequences indicate α helices (labeled A' and A to F); horizontal arrows
indicate β strands (labeled a to f, followed by the domain to which they belong). Residue numbers for HCoV Mpro
are given below the sequence; three-digit numbers are centered about the residue labeled. Symbols in the second
row below the alignment mark residues involved in dimerization of HCoV and TGEV Mpro: open circle (o) indicates
only main chain involved; asterisk (*) indicates only side chain involved; plus (+) indicates both main chain and side
chain involved. From the almost absolute conservation of side chains involved in dimerization, it can be concluded
that SARS-CoV Mpro also has the capacity to form dimers. In addition, side chains involved in inhibitor binding in the
TGEV Mpro complex are indicated by triangles (▲), and catalytic-site residues Cys144 and His41 as well as the
conserved Y160MH162 motif are shaded. (C) Cα plot of a monomer of SARS-CoV Mpro as model-built on the basis

HCoV Mpro forms a tight dimer in the crystal (the contact interface,
which is predominantly between domain II of molecule A and the
N-terminal residues of molecule B, is ~1300 Å²), with the two molecules
oriented perpendicular to one another (Fig. 2). Our previous crystal
structure of the TGEV Mpro (8) revealed the same type of dimer. We could
show by dynamic light scattering that both HCoV 229E and TGEV Mpro exist
as a mixture of monomers (~65%) and dimers (~35%) in diluted solutions
(1 to 2 mg of proteinase/ml). However, because the architecture of the
dimers, including most of the details of intermolecular interaction, is
the same in both TGEV Mpro (three independent dimers per asymmetric
unit) and HCoV 229E Mpro (one dimer per asymmetric unit), that is, in
completely different crystalline environments, we believe that dimer
formation is of biological relevance in these enzymes. In the Mpro
dimer, the N-terminal amino acid residues are squeezed in between
domains II and III of the parent monomer and domain II of the other
monomer, where they make a number of very specific interactions that
appear tailor-made to bind this segment with high affinity after
autocleavage. This mechanism would immediately enable the catalytic site
to act on other cleavage sites in the polyprotein. However, the exact
placement of the N terminus also seems to have a structural role for the
mature Mpro, because deletion of residues 1 to 5 leads to a decrease in
activity to 0.3% in the standard peptide-substrate assay (8). Nearly all
side chains of TGEV Mpro and HCoV 229E Mpro involved in the formation of
this dimer (marked in Fig. 1B) are conserved in the SARS-CoV enzyme, so
it is safe to assume a dimerization capacity for the latter as well.

Fig. 2. Dimer of HCoV Mpro. The N-terminal residues of each chain
squeeze between domains II and III of the parent monomer and domain II
of the other monomer. N and C termini are labeled by cyan and magenta
spheres and the letters N and C, respectively.

In the active site of HCoV 229E Mpro, Cys144 and His41 form a catalytic
dyad. In contrast to serine proteinases and other cysteine proteinases,
which have a catalytic triad, there is no third catalytic residue
present. HCoV 229E Mpro has Val84 in the corresponding position (Cys in
SARS-CoV Mpro), with its side chain pointing away from the active site.
A buried water molecule is found in the place that would normally be
occupied by the third member of the triad; this water is hydrogen-bonded
to His41 Nδ1, Gln163 Nε2, and Asp186 Oδ1 (His, His, and Asp in both
SARS-CoV and TGEV Mpro).

To allow structure-based design of drugs directed at coronavirus Mpros,
we sought to determine the exact binding mode of Mpro substrates. To
this end, we synthesized the substrate-analog CMK inhibitor
Cbz-Val-Asn-Ser-Thr-Leu-Gln-CMK and soaked it into crystals of TGEV
Mpro, because these were of better quality and diffracted to higher
resolution than those of HCoV 229E Mpro. The sequence of the inhibitor
was derived from residues P6 to P1 of the N-terminal autoprocessing site
of TGEV Mpro [SARS-CoV Mpro and HCoV 229E Mpro have
Thr-Ser-Ala-Val-Leu-Gln and Tyr-Gly-Ser-Thr-Leu-Gln, respectively, at
the corresponding positions (Fig. 1B)]. X-ray crystallographic analysis
at 2.37 Å resolution (19) revealed difference density for all residues
[except the benzyloxycarbonyl (Cbz) protective group] of the inhibitor
in two molecules (B and F) out of the six TGEV Mpro monomers in the
asymmetric unit (Fig. 3A). In these monomers, there is a covalent bond
between the Sγ atom of Cys144 and the methylene group of the CMK.

There are no substantial differences between the structures of the
enzyme in the free and in the complexed state. The substrate-analog
inhibitor binds in the shallow substrate-binding site at the surface of
the proteinase, between domains I and II (Fig. 3A). The residues
Val-Asn-Ser-Thr-Leu-Gln occupy, and thereby define, the subsites S6 to
S1 of the proteinase. Residues P5 to P3 form an antiparallel β sheet
with segment 164 to 167 of the long strand eII on one side, and they
also interact with segment 189 to 191 of the loop linking domains II and
III on the other (Fig. 3A). The functional importance of this latter
interaction is supported by the complete loss of proteolytic activity
upon deletion of the loop region in TGEV Mpro (8).
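
The P/S nomenclature used here numbers substrate residues outward from the scissile bond, so the last residue before the cleavage point is P1 and binds in subsite S1. A minimal sketch of that bookkeeping for the hexapeptide part of the inhibitor follows; the function name is hypothetical, and only the residue order is taken from the text.

```python
from typing import Dict, List, Tuple

# Peptide portion of Cbz-Val-Asn-Ser-Thr-Leu-Gln-CMK, N to C terminus.
INHIBITOR: List[str] = ["Val", "Asn", "Ser", "Thr", "Leu", "Gln"]


def label_positions(residues: List[str]) -> Dict[str, Tuple[str, str]]:
    """Map each P position to (residue, subsite it occupies).

    The residue nearest the cleavage point (the last one) is P1.
    """
    n = len(residues)
    return {
        f"P{n - i}": (residue, f"S{n - i}")
        for i, residue in enumerate(residues)
    }


for position, (residue, subsite) in label_positions(INHIBITOR).items():
    print(f"{position} {residue} -> {subsite}")
```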

In coronavirus Mpro polyprotein cleavage sites, the P1 position is
invariably occupied by Gln. At the very bottom of the Mpro S1 subsite,
the imidazole of His162 is suitably positioned to interact with the P1
glutamine side chain (Fig. 3, A and B). The required neutral state of
His162 over a broad pH range appears to be maintained by two important
interactions: (i) stacking onto the phenyl ring of Phe139 and (ii)
accepting a hydrogen bond from the hydroxyl group of the buried Tyr160.
In agreement with this structural interpretation, any replacement of
His162 completely abolishes the proteolytic activity of HCoV 229E and
feline coronavirus (FIPV) Mpro (15, 20). Furthermore, FIPV Mpro Tyr160
mutants have their proteolytic activity reduced by a factor of >30 (20).
All of these residues are conserved in SARS-CoV Mpro and, in fact, in
all coronavirus main proteinases. Other elements involved in the S1
pocket of the Mpro are the main-chain atoms of Ile51, Leu164, Glu165,
and His171. In SARS-CoV Mpro, Ile51 becomes Pro and Leu164 is Met,
although this is less relevant because these residues contribute to the
subsite with their main-chain atoms only (Fig. 3B; side chains involved
in specificity sites are marked by "▲" in Fig. 1B).

Fig. 3. (A) Refined model of the TGEV Mpro-bound hexapeptidyl CMK inhibitor built into electron
density (2|Fo| - |Fc|, contoured at 1σ above the mean). There was no density for the Cbz group
or for the Cβ atom of the P1 Gln. Inhibitor is shown in red, protein in gray. Cys144 is yellow. (B)
Inhibitors will bind to different coronavirus Mpros in an identical manner. A superimposition
(stereo image) of the substrate-binding regions of the free enzymes of HCoV Mpro (blue) and
SARS-CoV Mpro (gray) and of TGEV Mpro (green) in complex with the hexapeptidyl CMK inhibitor
(red) is shown. The covalent bond between the inhibitor and Cys144 of TGEV Mpro is in purple.

Apart from a few exceptions, coronavirus Mpro cleavage sites have a Leu
residue in the P2 position (9). The hydrophobic S2 subsite of the
proteinase is formed by the side chains of Leu164, Ile51, Thr47, His41,
and Tyr53. The corresponding residues in SARS-CoV Mpro are Met, Pro,
Asp, His, and Tyr. In addition, residues 186 to 188 line the S2 subsite
with some of their main-chain atoms. The Leu side chain of the inhibitor
is well accommodated in this pocket. It is noteworthy that SARS-CoV Mpro
has an alanine residue (Ala46) inserted in the loop between His41 and
Ile51, but this is easily accommodated in the structural model and does
not change the size or chemical properties of the S2 specificity site.
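
The residue-by-residue comparison of the S2 subsite can be tabulated in a small lookup structure, which makes it easy to see which positions differ in SARS-CoV Mpro. The sketch below encodes only the five side-chain positions and their SARS-CoV counterparts as listed in the paragraph above; the dictionary layout and function name are illustrative choices, not part of the reported analysis.

```python
from typing import Dict, Tuple

# Template position -> (template residue, corresponding SARS-CoV residue),
# in the order the text lists them for the S2 subsite.
S2_SUBSITE: Dict[str, Tuple[str, str]] = {
    "Leu164": ("Leu", "Met"),
    "Ile51": ("Ile", "Pro"),
    "Thr47": ("Thr", "Asp"),
    "His41": ("His", "His"),
    "Tyr53": ("Tyr", "Tyr"),
}


def substituted_positions(subsite: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """Return positions whose residue differs, mapped to the new residue."""
    return {
        position: sars
        for position, (template, sars) in subsite.items()
        if template != sars
    }


for position, replacement in substituted_positions(S2_SUBSITE).items():
    print(f"{position} -> {replacement}")
```

A table like this records identity only; whether a substitution changes the pocket's size or chemistry still has to be judged from the structural model, as the authors do for the Ala46 insertion.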

There is no specificity for any particular side chain at the P3 position
of coronavirus Mpro cleavage sites. This agrees with the P3 side chain
of our substrate analog being oriented toward bulk solvent. At the P4
position, there has to be a small amino acid residue such as Ser, Thr,
Val, or Pro because of the congested cavity formed by the side chains of
Leu164, Leu166, and Gln191, as well as the main-chain atoms of Ser189.
These are conserved or conservatively substituted (Leu164→Met164,
Ser189→Thr189) in SARS-CoV Mpro. The P5 Asn side chain interacts with
the main chain at Gly167, Ser189, and Gln191 (Pro, Thr, and Gln in the
SARS-CoV enzyme), thus involving the loop linking domains II and III,
whereas the P6 Val residue is not in contact with the protein. Although
the inhibitor used in the present study does not include a P1' residue,
it is easily seen that the common small P1' residues (Ser, Ala, or Gly)
can be readily accommodated in the S1' subsite of TGEV Mpro formed by
Leu27, His41, and Thr47, with the latter two residues also being
involved in the S2 subsite (Leu, His, and Asp in SARS-CoV Mpro).
Superimposition of the structures of the TGEV Mpro-CMK complex and the
free enzyme of HCoV 229E Mpro shows that the two substrate-binding sites
are basically the same (Fig. 3B). All residues along the P site of the
cleft are identical, with the exception of the conservative
Met190→Leu190 replacement (Ala in SARS-CoV Mpro). In other coronavirus
species, including the SARS pathogen, Mpro residues 167 and 187 to 189
show some substitutions, but because these residues contribute to
substrate binding with their main-chain atoms only, the identity of the
side chains is less important. Indeed, the substrate-binding site of the
SARS-CoV Mpro model matches those of its TGEV and HCoV 229E counterparts
perfectly (Fig. 3B). Thus, there is no doubt that the CMK inhibitor will
bind to the HCoV 229E Mpro and SARS-CoV Mpro, as well as to all other
coronavirus homologs, with similar affinity and in the same way as it
does to TGEV Mpro.

Fig. 4. Derivatives of the antirhinoviral drug AG7088 should inhibit coronavirus Mpros. A
superimposition (stereo image) of the substrate-binding regions of TGEV Mpro (marine) in complex with the
hexapeptidyl CMK inhibitor (red) and HRV2 3Cpro (green) in complex with the inhibitor AG7088
(yellow) is shown.

This proposal as well as the correctness of our structural model for
SARS-CoV Mpro are strongly supported by cleavage experiments that we
carried out with the recombinant SARS virus enzyme (18) and the peptide
H2N-VSVNSTLQ↓SGLRKMA-COOH (21). This peptide, which represents the
N-terminal autoprocessing site of TGEV Mpro [the cleavage site is
indicated by ↓ (Fig. 1B)] and contains the sequence of our CMK
inhibitor, is efficiently cleaved by SARS-CoV Mpro but not by its
inactive catalytic-site mutant Cys145→Ala145 (fig. S1).

Although peptidyl CMK inhibitors themselves are not useful as drugs
because of their high reactivity and their sensitivity to cleavage by
gastric and enteric proteinases, they are excellent substrate mimetics.
With the CMK template structure at hand, we compared the binding
mechanism to that seen in the distantly related picornavirus 3C
proteinases (3Cpros). The latter enzymes have a chymotrypsin-related
structure, similar to domains I and II of HCoV 229E Mpro, although some
of the secondary-structure elements are arranged differently, making
structural alignment difficult (sequence identity <10%). Also, they
completely lack a counterpart to domain III of coronavirus Mpros.
Nevertheless, the substrate specificity of picornavirus 3Cpros (22, 23)
for the P1', P1, and P4 sites is very similar to that of the coronavirus
Mpros (9). As shown in Fig. 4, we found similar interactions between
inhibitor and enzyme in the case of the human rhinovirus (HRV) serotype
2 3Cpro in complex with AG7088 (Scheme 1), an inhibitor carrying a
vinylogous ethyl ester instead of a CMK group (13). Only parts of the
two structures can be spatially superimposed (with an rms deviation of
2.10 Å for 134 pairs of Cα positions out of the ~180 residues in domains
I and II). Both inhibitors, the hexapeptidyl CMK and AG7088, bind to
their respective target proteinases through formation of an antiparallel
β sheet with strand eII (Fig. 4). However, completely different segments
of the polypeptide chain interact with the substrate analogs on the
opposite side: residues 188 to 191 of the loop connecting domains II and
III in Mpro, as opposed to the short β-strand 126 to 128 in HRV 3Cpro.
As a result, the architectures of the S2 subsites are entirely different
between the two enzymes; hence, the different specificities for the P2
residues of the substrates (Leu versus Phe). The inhibitor AG7088 has a
p-fluorophenylalanine side chain (p-fluorobenzyl) in this position.
Based on molecular modeling, we believe that this side chain might be
too long to fit into the S2 pocket of coronavirus Mpro, but an
unmodified benzyl group would probably fit, as evidenced by Phe
occurring in the P2 position of the C-terminal autocleavage site of the
SARS coronavirus enzyme (Fig. 1B and table S1). Apart from this
difference, the superimposition of the two complexes (Fig. 4) suggests
that the P1 and P4 residues of AG7088 (a lactone derivative of
glutamine, and 5-methyl-isoxazole-3-carbonyl, respectively) can be
easily accommodated by the coronavirus Mpro. Thus, AG7088 could well
serve as a starting point for modifications that should quickly lead to
an efficient and bioavailable inhibitor for coronavirus main
proteinases.

Scheme 1. P2 = p-fluoro-benzyl: AG7088

The 3D structures for coronavirus main proteinases presented here
provide a solid basis for the design of anticoronaviral drugs. The
binding modes of substrates and peptidic inhibitors are revealed by the
crystal structure of TGEV Mpro in complex with the hexapeptidyl CMK. In
spite of large differences in binding site architecture of the target
enzymes, compound AG7088 binds to human rhinovirus 3Cpro in much the
same orientation as seen for the CMK compound in the binding site of
TGEV Mpro. This finding indicates that derivatives of AG7088 might be
good starting points for the design of anticoronaviral drugs. Because
AG7088 has already been clinically tested for treatment of the common
cold (targeted at rhinovirus 3Cpro), and because there are no cellular
proteinases with which the inhibitors could interfere, prospects for
developing broad-spectrum antiviral drugs on the basis of the structures
presented here are good. Such drugs can be expected to be active against
several viral proteinases exhibiting Gln↓(Ser,Ala,Gly) specificity,
including the SARS coronavirus enzyme.